Multimodal Analysis Using Redundant Parametric Decompositions

Authors

  • Oscar Divorra Escoda
  • Pierre Vandergheynst
  • Gianluca Monaci
Abstract

In this work we explore the potential of a representational framework for decomposing audio-visual signals over redundant dictionaries using Matching Pursuits (MP) [1]. A human can interpret a scene that combines acoustic and visual stimuli with relative ease, drawing on both sources of information to experience a richer perception of the world. Computer systems, by contrast, have considerable difficulty dealing with multimodal signals, and the information that each component carries about the others is usually ignored. This is mainly due to the complexity of the dependencies between audio and video signals, and to the signal representations used when attempting to combine them in multimodal fusion systems. Redundant decompositions can describe audio-visual sequences in an extremely concise fashion while preserving good representational properties, thanks to the use of well-designed redundant dictionaries. We expect this to help overcome two typical problems of multimodal fusion algorithms: the high dimensionality of the signals considered, and the limitations of classical representation techniques, such as pixel-based measures (for video) or Fourier-like transforms (for audio), which take the physics of the problem into account only marginally. The experimental results we obtain using MP decompositions over redundant codebooks are encouraging and suggest that this research direction could open a new path in multimodal signal representation.
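For readers unfamiliar with the greedy decomposition the abstract refers to, the following is a minimal sketch of generic Matching Pursuit over a redundant dictionary. It is not the authors' implementation: the dictionary (unit spikes plus cosine atoms), the test signal, and the names `matching_pursuit` and `build_dictionary` are assumptions chosen only for illustration, and the audio-visual dictionaries used in the paper are not described in this abstract.

```python
import numpy as np


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy Matching Pursuit.

    `dictionary` holds one unit-norm atom per column. At each step the atom
    most correlated with the current residual is selected and its
    contribution is subtracted from the residual.
    """
    residual = np.asarray(signal, dtype=float).copy()
    indices, coeffs = [], []
    for _ in range(n_atoms):
        correlations = dictionary.T @ residual          # inner products with all atoms
        k = int(np.argmax(np.abs(correlations)))        # best-matching atom
        indices.append(k)
        coeffs.append(float(correlations[k]))
        residual = residual - correlations[k] * dictionary[:, k]
    return indices, coeffs, residual


def build_dictionary(n):
    """Hypothetical redundant dictionary: n spikes plus n cosine atoms in R^n."""
    t = np.arange(n)
    spikes = np.eye(n)
    cosines = np.stack(
        [np.cos(np.pi * (t + 0.5) * f / n) for f in range(n)], axis=1
    )
    atoms = np.hstack([spikes, cosines])                # shape (n, 2n): redundant
    return atoms / np.linalg.norm(atoms, axis=0)        # normalise every column


n = 64
D = build_dictionary(n)
# Illustrative signal: one spike atom plus one cosine atom (assumed, not from the paper).
x = 3.0 * D[:, 10] + 2.0 * D[:, n + 5]
indices, coeffs, residual = matching_pursuit(x, D, n_atoms=10)
print("selected atoms:", indices)
print("residual energy:", float(residual @ residual))
```

Because every atom has unit norm, each iteration removes exactly the squared coefficient from the residual energy, so the signal energy equals the sum of the squared coefficients plus the remaining residual energy. This is the property that lets a few well-chosen atoms give a concise description of a signal.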


Similar resources

From Tensor to Coupled Matrix/Tensor Decomposition

Decompositions of higher-order tensors are becoming more and more important in signal processing, data analysis, machine learning, scientific computing, optimization and many other fields. A new trend is the study of coupled matrix/tensor decompositions (i.e., decompositions of multiple matrices and/or tensors that are linked in one or several ways). Applications can be found in various fields ...


Parametric reduction complexity of Volterra models using tensor decompositions

Discrete-time Volterra models play an important role in many application areas. The main drawback of these models is their parametric complexity, owing to the very large number of parameters (the kernel coefficients). Using the symmetry property of the Volterra kernels, the kernels can be viewed as symmetric tensors. In this paper, we apply tensor decompositions (PARAFAC and HOSVD) for reducing the ...


Multimodal medical image fusion based on Yager’s intuitionistic fuzzy sets

The objective of image fusion for medical images is to combine multiple images obtained from various sources into a single image better suited to diagnosis. Most state-of-the-art image fusion techniques are based on non-fuzzy sets, and the fused image so obtained lacks complementary information. Intuitionistic fuzzy sets (IFS) are determined to be more suitable for civilian, and medi...


Rough Hypercuboid Based Supervised Regularized Canonical Correlation for Multimodal Data Analysis

One of the main problems in real life omics data analysis is how to extract relevant and non-redundant features from high dimensional multimodal data sets. In general, supervised regularized canonical correlation analysis (SRCCA) plays an important role in extracting new features from multimodal omics data sets. However, the existing SRCCA optimizes regularization parameters based on the qualit...


Optimized co-registration method of Spinal cord MR Neuroimaging data analysis and application for generating multi-parameter maps

Introduction: The purpose of multimodal co-registration in MR neuroimaging is to fuse two or more sets of images (T1, T2, fMRI, DTI, pMRI, …), combining their different information into a composite correlated data set for visualization, re-alignment, and generating the transform to the functional matrix. Multimodal registration and motion correction in spinal cord MR Neuroimag...



Publication date: 2004